This paper presents algorithms for predicting luminance changes
in successive television frames. The changes can result when objects 
in a TV scene move or when illumination varies. By a gradient 
search technique, which seeks to minimize a functional of the inter- 
frame prediction error, we estimate two parameters associated with 
these luminance changes — displacement and gain. Using the esti- 
mates of these parameters, we also develop, for interframe coding, 
adaptive predictors and a segmentor to determine which pels need to 
be transmitted. We describe several coder variations and compare 
them by computer simulations using three substantially different 
scene sequences. For these sequences, gain compensation with im- 
proved segmentation reduced the bit rate of a conditional replenish- 
ment encoder by 50.7, 11.1, and 39.3 percent. Displacement compen- 
sation reduced the bit rate by 61.0, 24.8, and 14.5 percent. Combined 
gain and displacement compensation reduced the bit rate by 63.4, 
32.2, and 44.6 percent. 

I. INTRODUCTION 

Television signals contain a significant amount of frame-to-frame 
redundancy because of the 60-Hz field rate used to eliminate flicker. 
Some of this redundancy can be removed by techniques of conditional 
replenishment (Refs. 1 to 5), in which only picture elements that have
changed by at least a threshold amount are transmitted. Conditional
replenishment has recently been improved by displacement compensation
(Refs. 6 to 13).
The displacement compensation schemes have been developed pri- 
marily to compensate for translational displacement of objects in 
uniform illumination. This is done by estimating the translation of an 
object in the scene and using it for predictive coding by taking 
differences of elements with respect to appropriately displaced ele- 
ments in the previous frame. Such schemes have been developed in 
both the picture element (pel) domain (Refs. 10 and 11) and the transform
domain (Refs. 12 and 13).

In this paper, we extend the translational displacement model of the 
picture intensity and develop recursive algorithms for estimating the 
parameters associated with the extended model. The extended model 
incorporates spatial and temporal variations of illumination, as well as 
translational displacement of moving objects. Illumination acts as a 
multiplicative factor or gain on the reflectance of objects in the scene. 
The extended model thus has two parameters, gain and displacement. 
The coder estimates these two parameters recursively from the pre- 
viously transmitted data so that no additional information need be 
transmitted to specify them. 

Illumination and displacement variations occur to different degrees 
in different television scenes and, therefore, the efficiency of these new 
compensation schemes varies from scene to scene. Three scenes con- 
taining a sequence of 60 frames each have been used to evaluate the 
efficiency of the new coding algorithms. These sequences contain 
distinctly different types of motion of objects and frame-to-frame 
variation of illumination. We summarize our findings as follows. En- 
coders using gain compensation (improved prediction and segmenta- 
tion) alone require a slight increase in hardware compared to the 
conditional replenishment encoders but are capable of reducing the bit 
rate by about 11 to 51 percent, depending upon the scene. Displacement
compensation encoders (Refs. 10 and 11) are more complex and reduce the bit
rate by 15 to 61 percent compared to conditional replenishment.
Gain and displacement compensation together reduce the bit rate by 
about 32 to 63 percent. These improvements in bit rates are a function 
of the type of scene. In one of the scenes, the gain and displacement 
compensation has reduced the peak bit rate by about 75 percent 
compared to conditional replenishment. 

II. LUMINANCE VARIATION AND ITS COMPENSATION 

Earlier work (Refs. 10 and 11) introduced a recursive technique for estimating the
translational displacement of objects moving in uniform illumination. 
The image intensity in the present frame at a spatial location x and
time t was assumed to be equal to the spatially displaced intensity at
time $t - \tau$:

    I(x, t) = I(x - D, t - \tau),    (1)

where D is the displacement of the object and $\tau$ is either the field or
the frame interval. The steepest descent algorithm for recursive
estimation of displacement is given by

    \hat{D}^{i} = \hat{D}^{i-1} - \epsilon \, DFD(x, \hat{D}^{i-1}) \, \nabla I(x - \hat{D}^{i-1}, t - \tau),    (2)

where the displaced field or frame difference, DFD(·, ·), is defined by

    DFD(x, D) = I(x, t) - I(x - D, t - \tau),    (3)

$\epsilon$ is a small positive constant influencing convergence rate, $\nabla I$ is the
spatial gradient of the intensity, and $\hat{D}^{i}$ is the $i$th estimate of D.
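The recursion of eqs. (1) to (3) can be illustrated with a minimal one-dimensional sketch. It is not the simulation code of Refs. 10 and 11: the sinusoidal test pattern, the names `prev_frame`, `prev_gradient`, and `estimate_displacement`, and the step size are all assumptions chosen so that the scalar recursion is stable, and the displacement is treated as a scalar along one scan line rather than as a two-dimensional vector.

```python
import math

PERIOD = 32.0  # hypothetical spatial period of the test pattern, in pels


def prev_frame(x):
    """Assumed previous-frame intensity I(x, t - tau) along one scan line."""
    return 100.0 + 50.0 * math.sin(2.0 * math.pi * x / PERIOD)


def prev_gradient(x):
    """Analytic spatial derivative of prev_frame (1-D stand-in for the gradient)."""
    return 50.0 * (2.0 * math.pi / PERIOD) * math.cos(2.0 * math.pi * x / PERIOD)


def estimate_displacement(true_d=2.0, eps=0.01, n_pels=256):
    """Run the recursion of eq. (2) pel by pel and return the final estimate."""
    d_hat = 0.0
    for x in range(n_pels):
        cur = prev_frame(x - true_d)        # eq. (1): present frame is a shifted copy
        dfd = cur - prev_frame(x - d_hat)   # eq. (3): displaced frame difference
        d_hat -= eps * dfd * prev_gradient(x - d_hat)  # eq. (2): steepest-descent step
    return d_hat


if __name__ == "__main__":
    print(estimate_displacement())
```

In this sketch the estimate starts at zero and is refined at every pel. The step size is assumed small enough that the product of eps and the squared gradient stays below 1, which keeps each update from overshooting on this particular pattern.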

In several situations, the above model of image intensity is not 
adequate. The three examples of this that follow involve certain 
conditions on illuminance L(x, t), and reflectance R(x, t) comprising 
the scene. 

(i) Time modulation of illuminance, reflectance being constant with 
respect to time. This situation corresponds to shadows created in the 
background of a scene as a result of a moving object. It is modeled by 

    I(x, t) = L(t) R(x)    (4)

    I(x, t - \tau) = L(t - \tau) R(x)    (5)

and, therefore,

    I(x, t) = \frac{L(t)}{L(t - \tau)} I(x, t - \tau).    (6)

(ii) Translational displacement of reflectance due to object motion;
spatially nonuniform but temporally constant illumination. This occurs
quite commonly, since the illumination is generally not perfectly
uniform. It is modeled by

    I(x, t) = L(x) R(x, t)    (7)

    I(x, t - \tau) = L(x) R(x + D, t)    (8)

and, therefore,

    I(x, t) = \frac{L(x)}{L(x - D)} I(x - D, t - \tau)    (9)

or, alternatively,

    I(x, t) = \frac{R(x, t)}{R(x + D, t)} I(x, t - \tau).    (10)

(iii) Translational displacement of illumination, spatially nonuniform
but temporally constant reflectance. This is the dual of case (ii)
and can be caused by a shadow created in the background by a moving
object. It is modeled by

    I(x, t) = L(x, t) R(x)    (11)

    I(x, t - \tau) = L(x + D, t) R(x)    (12)

and, therefore,

    I(x, t) = \frac{R(x)}{R(x - D)} I(x - D, t - \tau)    (13)

or, alternatively,

    I(x, t) = \frac{L(x, t)}{L(x + D, t)} I(x, t - \tau).    (14)
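The two equivalent forms obtained for case (ii) can be checked numerically. The sketch below is only an illustration under assumed data: the illumination ramp, the reflectance pattern, the integer displacement, and the name `check_case_ii` are all hypothetical. It builds the two frames from eqs. (7) and (8) and measures how far eqs. (9) and (10) deviate from the present-frame intensity.

```python
def check_case_ii(n=32, d=3):
    """Return the largest deviations from eqs. (9) and (10) on a toy scan line."""
    # Assumed smooth, strictly positive illumination L(x).
    lum = [1.0 + 0.02 * x for x in range(n)]
    # Assumed strictly positive reflectance R(x, t), extended d pels past the line end.
    refl = [0.2 + 0.5 * ((7 * x) % 11) / 11.0 for x in range(n + d)]

    i_now = [lum[x] * refl[x] for x in range(n)]        # eq. (7)
    i_prev = [lum[x] * refl[x + d] for x in range(n)]   # eq. (8)

    err9 = err10 = 0.0
    for x in range(d, n):
        rhs9 = lum[x] / lum[x - d] * i_prev[x - d]      # eq. (9): gain is a ratio of illuminations
        rhs10 = refl[x] / refl[x + d] * i_prev[x]       # eq. (10): gain is a ratio of reflectances
        err9 = max(err9, abs(i_now[x] - rhs9))
        err10 = max(err10, abs(i_now[x] - rhs10))
    return err9, err10


if __name__ == "__main__":
    print(check_case_ii())
```

Both forms reproduce the present frame up to rounding. They differ in the multiplicative factor: in eq. (9) it is a ratio of illumination values, while in eq. (10) it is a ratio of reflectance values, which in this toy pattern changes much more abruptly from pel to pel.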

Thus, in several common cases, a multiplicative factor, which may
vary temporally and spatially, describes the intensity changes. We are
therefore motivated to generalize the model of eq. (1) to

    I(x, t) = \rho_1 I(x, t - \tau)    (15a)

and

    I(x, t) = \rho_2 I(x - D, t - \tau).    (15b)

Equations (15a) and (15b) are alternative models of frame-to-frame
intensity variation. However, they are not equivalent alternatives. In
a typical television scene, different parts of the picture change in
different ways. It is important to recognize these different parts and
compensate the intensity changes by appropriately choosing (15a) or
(15b). We discuss an algorithm in the next section to identify these
different parts and use appropriate compensation for them.

The compensation for the above variation of intensity can be
accomplished by estimating $\rho_1$, $\rho_2$, and D using gradient-type
algorithms as before. The algorithm for compensation of (15a) is:

    \rho_1^{i+1} = \rho_1^{i} + \epsilon_1 [I(x, t) - \rho_1^{i} I(x, t - \tau)] I(x, t - \tau).    (16)

For compensation of (15b), both $\rho_2$ and D need to be estimated. This
is accomplished by

    \rho_2^{i+1} = \rho_2^{i} + \epsilon_1 \, DFD(x, \rho_2^{i}, \hat{D}^{i}) \, I(x - \hat{D}^{i}, t - \tau)    (17)

    \hat{D}^{i+1} = \hat{D}^{i} - \epsilon_2 \, \rho_2^{i} \, DFD(x, \rho_2^{i}, \hat{D}^{i}) \, \nabla I(x - \hat{D}^{i}, t - \tau),    (18)

where DFD(·, ·, ·) is defined by

    DFD(x, \rho_2, D) = I(x, t) - \rho_2 I(x - D, t - \tau)    (19)

and $\epsilon_1$ and $\epsilon_2$ are positive scalar constants. The iterations may
proceed from sample to sample along a scan line. Use of algorithms (16) to
(19) assumes that $\rho_1$, $\rho_2$, and D vary sufficiently slowly spatially, an
assumption that appears to be often valid in practice.
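As a rough illustration of the gain recursion of eq. (16), the sketch below applies it to a scan line whose intensity is scaled by a uniform factor between frames. The normalized intensity range, the step size of 0.5, the test pattern, and the name `estimate_gain` are assumptions made for the example; with intensities on a larger scale the step size would have to be reduced accordingly, since the estimation error is multiplied by one minus the step size times the squared intensity at every pel.

```python
def estimate_gain(prev, cur, eps=0.5, rho0=1.0):
    """Apply eq. (16) pel by pel; return the list of successive gain estimates."""
    rho = rho0
    history = []
    for i_prev, i_cur in zip(prev, cur):
        gfdif = i_cur - rho * i_prev   # gain-compensated frame difference
        rho += eps * gfdif * i_prev    # eq. (16)
        history.append(rho)
    return history


# Assumed test data: intensities normalized to [0.3, 1.0], uniform gain of 0.9.
prev_line = [0.3 + 0.1 * ((5 * x) % 8) for x in range(64)]
cur_line = [0.9 * v for v in prev_line]
rho_history = estimate_gain(prev_line, cur_line)

if __name__ == "__main__":
    print(rho_history[0], rho_history[-1])
```

Starting from unity, the estimate moves toward the assumed gain of 0.9 as the recursion proceeds along the line, which is the slow-variation situation that the text says the algorithm relies on.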

We note in passing that eq. (15b) is the appropriate model for
intensity variation due to translational motion of objects. For uniform
illumination, $\rho_2$ will be unity and D will be equal to the displacement.
However, we can see from (10) that eq. (15a) also provides a description
of object motion. Because of this, eq. (15a), with $\rho_1$ obtained from (16),
in some cases approximates variations of intensity due to object
translation. This occurs if the parameter $\rho_1$ varies sufficiently slowly
to be learned by (16). An example of this is shown in Fig. 1, which
shows a vertical edge in intensity that is displaced from one frame to
the next in the horizontal direction by two picture elements.
Recurrence relationship (16) is carried out in Fig. 1, where a plot of
$\rho_1$ along the scan line is shown. Also shown are the frame difference
and the gain-compensated frame difference GFDIF:

    GFDIF(x, \rho_1, t) = I(x, t) - \rho_1 I(x, t - \tau).    (20)

Fig. 1. (a) A synthetic vertical edge whose intensity for any scan line is plotted as a
function of horizontal position for two consecutive frames. The horizontal shift in the
edge is 2 pels per frame. (b) Plot of the value of $\rho_1$ generated recursively from eq. (16),
starting with a value of 1.0. Recursion is assumed to proceed pel by pel in the direction
of scanning. (c) Plot of frame difference (or the prediction error in the conditional
replenishment coders) and the gain-compensated frame difference (or the prediction
error in the gain-compensated coders) as a function of pel position.

It is seen then that the gain-compensated frame difference is able to
provide a certain degree of displacement compensation and, therefore,
results in errors that are less than the frame differences traditionally
used in conditional replenishment.

III. CODER SIMULATIONS 

We have performed computer simulations of four different coders. 
Two of these, frame difference conditional replenishment and displace- 
ment compensation, were simulated for comparison purposes. The 
parameters used for these two were identical to those given in our
earlier paper (Ref. 11). The third simulated coder (called the
gain-compensated coder) worked as follows:

Let the frame difference, FDIF, and the gain-compensated frame
difference, GFDIF, be defined as

    FDIF(x, t) = I(x, t) - I(x, t - \tau)    (21)

    GFDIF(x, \rho_1, t) = I(x, t) - \rho_1 I(x, t - \tau).    (22)

Then the intensity at pel Z (Fig. 2) is predicted by:

    P_Z = \begin{cases}
        I(Z, t - \tau), & \text{if } \sum_{x \in \{B, C, D\}} |FDIF(x, t)| < \sum_{x \in \{B, C, D\}} |GFDIF(x, \rho_1^{C}, t)| \\
        \rho_1^{C} I(Z, t - \tau), & \text{otherwise}
    \end{cases}    (23)

where $\rho_1^{C}$ is the previously estimated $\rho_1$ at location C. The iteration
for $\rho_1$ proceeds along the scan line according to eq. (16). However, to
simplify multiplications that are necessary for implementing eq. (16),
the algorithm was modified as

    \rho_1^{i+1} = \rho_1^{i} + \epsilon_1 \, sgn[I(x, t) - \rho_1^{i} I(x, t - \tau)],    (24)

where

    sgn(u) = \begin{cases}
        0, & \text{if } u = 0 \\
        u / |u|, & \text{otherwise.}
    \end{cases}    (25)

In our simulations, $\rho_1$ was constrained to be in the interval
[15/16, 17/16], and $\epsilon_1$ was taken to be 1/128. Iterations were carried
out pel by pel in the direction of the scanning with the last value of
$\rho_1$ on a line being used as an initial estimate for the next line. This
coder requires a simple modification of the frame difference conditional
replenishment coder. A block diagram of the coder is shown in Fig. 3.
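The simplified update of eqs. (24) and (25), together with the constraint on the gain, can be sketched as follows. The step size of 1/128 and the interval [15/16, 17/16] are the values quoted above; the function names, the constant test lines, and the choice to clamp after every step are assumptions made for illustration.

```python
def sgn(u):
    """Sign function of eq. (25)."""
    if u == 0:
        return 0
    return 1 if u > 0 else -1


def update_gain(rho, i_cur, i_prev, eps=1 / 128, lo=15 / 16, hi=17 / 16):
    """One step of eq. (24), followed by clamping to [lo, hi]."""
    rho = rho + eps * sgn(i_cur - rho * i_prev)
    return min(hi, max(lo, rho))


def run_line(prev, cur, rho0=1.0):
    """Iterate update_gain along one scan line; return the final estimate."""
    rho = rho0
    for i_prev, i_cur in zip(prev, cur):
        rho = update_gain(rho, i_cur, i_prev)
    return rho


if __name__ == "__main__":
    prev_line = [100.0] * 40
    print(run_line(prev_line, [103.0] * 40))  # gain of 1.03, inside the allowed interval
    print(run_line(prev_line, [200.0] * 40))  # gain of 2.0, outside the allowed interval
```

Because each step changes the estimate by a fixed amount, the estimate in this sketch ends up hovering within one step of a gain that lies inside the interval, and it saturates at the interval boundary when the true gain lies outside it.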

Fig. 2. Configuration of pels for gain compensation. Dashed and solid lines denote
scan lines in alternate fields. I(Z, t) is predicted by $\rho_1^{C} I(Z, t - \tau)$, where
$\rho_1^{C}$ is the estimate of $\rho_1$ at pel position C.

The fourth coder consists of gain and translation compensation used
simultaneously. In this case, the prediction for the present pel Z of
Fig. 2 is made by switching between the three predictors given below,
where $\tau_1$ and $\tau_2$ are frame and field intervals, respectively.

    P_1 = I(Z, t - \tau_1)
    P_2 = \rho_1^{C} I(Z, t - \tau_1)
    P_3 = \rho_2^{C} I(Z - \hat{D}^{C}, t - \tau_2),    (26)

where $\hat{D}^{C}$, $\rho_2^{C}$, and $\rho_1^{C}$ are the estimated translation and gains
for pel C from the previous line. $\rho_1^{C}$ is estimated recursively
according to eq. (24) with $\tau = \tau_1$. $\rho_2^{C}$ and $\hat{D}^{C}$ are estimated using
a simplification of eqs. (17) and (18):

    \rho_2^{i+1} = \rho_2^{i} + \epsilon_1 \, sgn[DFD(x, \rho_2^{i}, \hat{D}^{i})]    (27)

    \hat{D}^{i+1} = \hat{D}^{i} - \epsilon_2 \, sgn[DFD(x, \rho_2^{i}, \hat{D}^{i})] \cdot sgn[\nabla I(x - \hat{D}^{i}, t - \tau)]    (28)
Table I. Average bits per field
Average bits are computed by averaging over 60 frames of the sequences Judy, Mike
and Nadine, and Mike and John, for the four coders. The numbers for conditional
replenishment and displacement compensation are taken from Refs. 1 and 11.

                   Conditional     Displacement    Gain            Gain and Displacement
Scene              Replenishment   Compensation    Compensation    Compensation
Judy               38,576          15,044          19,012          14,130
Mike and Nadine    94,975          71,407          84,401          64,401
Mike and John      66,136          56,548          40,164          36,612

In eq. (27) and the displacement recursion that follows it, $\tau = \tau_2$.
As in the gain-compensated coder, initial estimates of $\rho_1$, $\rho_2$, and D
at the beginning of a line were taken to be the corresponding estimates
at the last pel of the previous line. $\epsilon_1$ and $\epsilon_2$ are taken to be
1/128 and 1/16, respectively. Iteration proceeds pel by pel
in the direction of scanning. The selection of the predictor for a pel
from the above three predictors is made by choosing the one that
resulted in the least error for the adjacent previous line elements.
Referring to Fig. 2, the selection rule is as follows:

    P_Z = \begin{cases}
        P_1, & \text{if } E_1 = MIN(E_1, E_2, E_3) \\
        P_2, & \text{if } E_2 = MIN(E_1, E_2, E_3) \text{ and } E_1 \neq E_2 \\
        P_3, & \text{otherwise,}
    \end{cases}    (29)

where

    E_1 = \sum_{x \in \{B, C, D\}} |FDIF(x, t)|
    E_2 = \sum_{x \in \{B, C, D\}} |GFDIF(x, \rho_1^{C}, t)|
    E_3 = \sum_{x \in \{B, C, D\}} |DFD(x, \rho_2^{C}, \hat{D}^{C})|

and MIN(·, ·, ·) is the minimum value of its three arguments. Using
the techniques described in Ref. 11, the computation of $E_1$, $E_2$, and $E_3$
can be significantly simplified by substitution of other FDIF, GFDIF,
and DFD signals.
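The selection rule of eq. (29) amounts to a three-way comparison with a fixed tie-breaking order. The sketch below is one way to write it; the function names and the integer return codes are assumptions, and the error sums are passed in directly rather than computed from FDIF, GFDIF, and DFD signals.

```python
def abs_error_sum(actual, predicted):
    """Sum of absolute prediction errors over the neighbouring pels B, C, D."""
    return sum(abs(a - p) for a, p in zip(actual, predicted))


def select_predictor(e1, e2, e3):
    """Return 1, 2, or 3 according to the selection rule of eq. (29)."""
    smallest = min(e1, e2, e3)
    if e1 == smallest:
        return 1  # previous-frame predictor P1
    if e2 == smallest and e1 != e2:
        return 2  # gain-compensated predictor P2
    return 3      # gain- and displacement-compensated predictor P3


if __name__ == "__main__":
    print(select_predictor(4, 2, 3))
    print(select_predictor(2, 2, 5))
```

Read this way, ties go to the simpler predictor: P1 is kept whenever its error sum is among the smallest, and P2 is preferred over P3 when those two tie.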

Segmentation of pels into those selected for transmission and those
dropped from transmission for the first coder is described in Refs. 10
and 11. For coders 2, 3, and 4, simple segmentation was used by
thresholding* the magnitude of the prediction error: whenever the
magnitude of the prediction error exceeded a threshold (3 out of 256),
the prediction error was sent in quantized form; otherwise, it was dropped
from transmission and the pel was reconstructed at the receiver
assuming zero prediction error. Thresholds were adjusted to give a
good quality picture in which coding distortions were visible but not
annoying. The quantizer used for all four coders was 35-level,
symmetric, and companded. It is shown in Fig. 10 of Ref. 10. Bit rates
were computed by adding the bits necessary to specify the prediction
errors and their addresses. Prediction error bits were approximated by
multiplying the entropy of the transmitted prediction error by the
number of unpredictable pels. The addressing bits were calculated by
standard run length coding (in the direction of the scan line) of the
unpredictable pels.

* This type of segmentation is different from that used in coder 1 (conditional
replenishment) where an attempt is made to form contiguous areas of transmitted pels.
This increases the number of transmitted pels and consequently bits/field, especially if
run-length coding is used for addressing. A significant part of the reduction in bit rate by
gain compensation is due to the different segmentation used.

Fig. 4. Plots of bits/field versus the field number using the four coders for the scenes
(a) Judy, (b) Mike and Nadine, and (c) Mike and John. The four coders are: (1) frame
difference conditional replenishment; (2) gain compensation; (3) displacement
compensation; (4) gain and displacement compensation.
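The threshold segmentation and the entropy-based estimate of prediction error bits described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the 35-level companded quantizer of Ref. 10 is not reproduced (the errors are treated as already quantized), and the runs are only extracted, since the text does not specify the run-length code used to count addressing bits.

```python
import math
from collections import Counter


def segment(errors, threshold=3):
    """Flag pels whose prediction-error magnitude exceeds the threshold."""
    return [abs(e) > threshold for e in errors]


def prediction_error_bits(transmitted):
    """Entropy of the transmitted errors (bits/pel) times the number of such pels."""
    n = len(transmitted)
    if n == 0:
        return 0.0
    counts = Counter(transmitted)
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    return entropy * n


def run_lengths(flags):
    """Lengths of consecutive runs of equal flags along a scan line."""
    runs = []
    for f in flags:
        if runs and runs[-1][0] == f:
            runs[-1][1] += 1
        else:
            runs.append([f, 1])
    return [(f, n) for f, n in runs]


if __name__ == "__main__":
    errors = [0, 5, -5, 1, 4, -2, 5, -5]
    flags = segment(errors)
    sent = [e for e, f in zip(errors, flags) if f]
    print(flags, prediction_error_bits(sent), run_lengths(flags))
```

A full bit count would add to `prediction_error_bits` the cost of coding the runs returned by `run_lengths`, using whatever run-length code the coder employs.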

Fig. 5. One frame from the scene sequence Mike and John. The speed of each person
is approximately 5 pels per frame.

Fig. 6. Frame difference signal for the frame of Fig. 5. Time-varying illumination
results in changing "background" intensities in the central region of the frame.

Figures 4a, 4b, and 4c show plots of bit rates with respect to time for
three 60-frame sequences. The first two sequences, Judy and Mike and
Nadine, are the same as those used in Ref. 10. The third sequence,
Mike and John, contains large areas of nonuniform illumination,
movement of shadows, and people entering the camera field of
view (25th frame) and walking briskly around each other. Although
there are no moving objects in the frames up to frame 24, the luminance
of a considerable area changes as a result of shadows generated by objects
out of camera view. It should be noted that the second (Mike and
Nadine) and third sequences are similar; however, Mike and Nadine
is a panned view of two people who are always in the camera view.
The percent of moving area (as determined by the "frame difference"
segmentor of coder 1) for the Mike and John sequence varies between
53 and 84. Figures 4a, 4b, and 4c each have four curves, one for each of
the four coders. Parameters of each of these four coders were adjusted such
that the picture quality was approximately equivalent as indicated by
informal subjective tests made by the authors. Table I also shows the
average bit rates (over 60 frames) for the four coders and three scenes.
For the (head and shoulders) scene Judy, where the illumination is
close to uniform, the gain-compensated coder results in average bit
rates that are about 50 percent below those of the frame difference
conditional replenishment coder. The displacement-compensated
coder, on the other hand, results in about a 61 percent decrease. Thus,
gain compensation reduces the coder bit rate by a significant amount
without the increase in complexity associated with the
displacement-compensated encoder. The gain- and displacement-compensated
encoder reduces the bit rate by 63.4 percent for Judy. This additional
decrease due to gain compensation is small, perhaps as a result of
relatively uniform illumination and lack of shadows in the scene. For
the scene Mike and Nadine, again, most of the decrease in bit rate is
provided by displacement compensation, and gain compensation
results in bit rates that are between those of conditional replenishment and
displacement compensation. Also, combining gain and displacement
compensation provides an additional decrease of 7.3 percent.
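The percentage reductions quoted for the three scenes follow directly from the average bits per field in Table I. The short sketch below recomputes them; the dictionary layout and the names `TABLE_I` and `percent_reduction` are assumptions made for the example, while the numbers are those of Table I.

```python
# Average bits per field from Table I, in the column order:
# conditional replenishment, displacement, gain, gain and displacement.
TABLE_I = {
    "Judy": (38576, 15044, 19012, 14130),
    "Mike and Nadine": (94975, 71407, 84401, 64401),
    "Mike and John": (66136, 56548, 40164, 36612),
}


def percent_reduction(baseline, coded):
    """Bit-rate reduction relative to the baseline coder, in percent."""
    return 100.0 * (baseline - coded) / baseline


if __name__ == "__main__":
    for scene, (cr, disp, gain, both) in TABLE_I.items():
        print(
            scene,
            round(percent_reduction(cr, gain), 1),
            round(percent_reduction(cr, disp), 1),
            round(percent_reduction(cr, both), 1),
        )
```

Rounded to one decimal place, these ratios match the reductions stated in the abstract for gain compensation, displacement compensation, and the combination of the two.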

The scene Mike and John contains large areas of nonuniform illu- 
mination and moving shadows. The first 25 frames of this sequence 
contain no object motion. Most of the changes in the intensity are a 
result of the moving shadows cast by objects not in the camera view. 
As a result, the bit rates of the conditional replenishment and
displacement-compensated coders are similar for the first 25 frames, and
gain compensation reduces the bit rates significantly. Following the
25th frame, moving objects enter the camera view, and displacement 
compensation is clearly superior to conditional replenishment. Even 
here, however, gain compensation performs slightly better than dis- 
placement compensation because of the large changes in illumination 
still taking place. Figures 5, 6, and 7 show the input scene, the frame 
difference, and the regions of different predictor usage, respectively, 
for one of the frames of the sequence Mike and John processed by the 
gain- and displacement-compensated coder. It is seen that, in most of 
the areas of shadows, the gain-compensation predictor is used, whereas 
in areas of pure motion the displacement-compensated predictor is 
used. 




Fig. 7. Regions of different predictor use by the gain- and displacement-compensated
coder for the frame of Fig. 5. Bright pels correspond to the displacement-compensated
predictor [$P_3$ of eq. (26)], light pels correspond to the gain-compensated predictor
[$P_2$ of eq. (26)], and dark pels correspond to the previous frame predictor [$P_1$ of
eq. (26)]. It is seen that unchanging background areas use the previous frame
predictor, and most of the shadow regions use the gain-compensated predictor.
IV. CONCLUSIONS

We have presented algorithms for estimating parameters associated
with certain common types of frame-to-frame intensity variations. The
parameters, gain and displacement, are estimated by recursive
algorithms using information previously transmitted by the coder. Using
the estimates of these parameters, adaptive predictors and a segmentor
to determine which pels need to be transmitted are synthesized for an
interframe DPCM coder. Computer simulations using three scenes
containing 60 frames each indicate that, compared to conditional
replenishment, the decrease in bit rates using gain compensation is between
11 and 51 percent; using displacement compensation, between 15 and
61 percent; and using gain and displacement compensation together,
between 32 and 63 percent. These decreases are a function of the type of
intensity variation in a scene, but may be typical for many videoconferencing
and broadcasting applications.